question: What is the Markov property?
option 1: The agent learns from trial-and-error interaction with the environment
option 2: The environment provides scalar rewards to the agent
option 3: The agent receives observations from the environment
option 4: The future is conditionally independent of the past given the current state
option 5: The agent uses a policy to choose actions